Improving Collocation Correction by Ranking Suggestions Using Linguistic Knowledge
Abstract
The importance of collocations in second language learning is generally acknowledged. Studies show that the “collocation density” in learner corpora is nearly the same as in native corpora, i.e., learners use collocations about as often as native speakers do, while the collocation error rate in learner corpora is about ten times as high as in native reference corpora. Computer-assisted language learning (CALL) could therefore be of great help to learners in mastering collocations. Surprisingly, however, few works specifically address CALL-oriented collocation learning assistants that either detect miscollocations in learners' writing and propose corrections, or let the learner verify whether a word co-occurrence is a correct collocation and obtain correction suggestions if it is determined to be a miscollocation. This neglect is likely due, on the one hand, to the focus of CALL research so far on grammatical matters and, on the other hand, to the complexity of the problem. To provide an adequate correction of a miscollocation, the collocation learning assistant must “guess” the meaning that the learner intended to express. This makes it very different from grammar and spell checkers, which can draw on the grammatical and orthographic regularities of a language, respectively. In this paper, we focus on providing a ranked list of correction suggestions in a setting in which the learner submits a collocation for verification and, in the case of a miscollocation, obtains a list of correction suggestions.
We show that the retrieval and ranking of the suggestions benefit greatly from NLP techniques that provide the syntactic dependency structure and subcategorization information of the word co-occurrences, and from a weighted Pointwise Mutual Information (PMI) that reflects the fact that, in a collocation, the base is freely chosen by the speaker while the occurrence of the collocate is restricted by the base, i.e., that collocations are inherently asymmetric.
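The abstract does not give the formula for the weighted PMI, so the following is only a minimal sketch of how an asymmetric, PMI-based ranking of collocate suggestions could look. It assumes one plausible weighting, namely dividing the standard PMI by the self-information of the collocate, -log2 P(collocate), so that the score is normalized on the collocate side only. The function name `rank_collocates`, the toy counts in `PAIR_COUNTS`, and the weighting itself are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter

# Hypothetical (base, collocate) co-occurrence counts; in practice these would
# come from dependency-parsed corpus data. The numbers are invented for illustration.
PAIR_COUNTS = {
    ("knowledge", "gain"): 30,
    ("knowledge", "acquire"): 40,
    ("knowledge", "learn"): 2,
    ("language", "learn"): 60,
    ("damage", "cause"): 50,
    ("damage", "make"): 1,
    ("decision", "make"): 70,
    ("experience", "gain"): 20,
}


def rank_collocates(pair_counts, base, candidates):
    """Rank candidate collocates for a base by a collocate-normalized PMI.

    ASSUMPTION: the weight -log2 P(collocate) is one possible way to make
    PMI asymmetric; the source abstract does not specify its weighting.
    """
    total = sum(pair_counts.values())
    base_counts = Counter()
    collocate_counts = Counter()
    for (b, c), n in pair_counts.items():
        base_counts[b] += n
        collocate_counts[c] += n

    scored = []
    for collocate in candidates:
        joint = pair_counts.get((base, collocate), 0)
        if joint == 0:
            continue  # unseen pairs cannot be scored with plain PMI
        p_joint = joint / total
        p_base = base_counts[base] / total
        p_collocate = collocate_counts[collocate] / total
        pmi = math.log2(p_joint / (p_base * p_collocate))
        weight = -math.log2(p_collocate)  # self-information of the collocate
        score = pmi / weight if weight > 0 else 0.0
        scored.append((collocate, score))

    # Highest score first: the best correction suggestion leads the list.
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranking = rank_collocates(PAIR_COUNTS, "knowledge", ["learn", "gain", "acquire", "obtain"])
```

With these toy counts, `acquire` and `gain` rank above `learn` for the base `knowledge`, and the unseen candidate `obtain` is dropped. A real system would first use the dependency and subcategorization information mentioned above to restrict the candidate set before scoring.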
Similar resources
Proceedings of the third workshop on NLP for computer-assisted language learning
The importance of collocations in the context of second language learning is generally acknowledged. Studies show that the “collocation density” in learner corpora is nearly the same as in native corpora, i.e., that use of collocations by learners is as common as it is by native speakers, while the collocation error rate in learner corpora is about ten times as high as in native reference corpo...
Automated Suggestions for Miscollocations
One of the most common and persistent error types in second language writing is the collocation error, such as learn knowledge instead of gain or acquire knowledge, or make damage rather than cause damage. In this work-in-progress report, we propose a probabilistic model for suggesting corrections to lexical collocation errors. The probabilistic model incorporates three features: word association s...
TCtract-A Collocation Extraction Approach for Noun Phrases Using Shallow Parsing Rules and Statistic Models
This paper presents a hybrid method for extracting Chinese noun phrase collocations that combines a statistical model with rule-based linguistic knowledge. The algorithm first extracts all the noun phrase collocations from a shallow parsed corpus by using syntactic knowledge in the form of phrase rules. It then removes pseudo collocations by using a set of statistic-based association measures (...
Improving Contextual Suggestions using Open Web Domain Knowledge
Contextual suggestion aims at recommending items to users given their current context, such as location-based tourist recommendations. Our contextual suggestion ranking model consists of two main components: selecting candidate suggestions and providing a ranked list of personalized suggestions. We focus on selecting appropriate suggestions from the ClueWeb12 collection using tourist domain kno...
Spelling Correction Based on User Search Contextual Analysis and Domain Knowledge
We propose a spelling correction algorithm that combines trusted domain knowledge and query log information for query spelling correction. This algorithm uses query reformulations in the query log and bigram language models built from queries for efficiently and effectively generating correction suggestions and ranking them to find valid corrections. Experimental results show that for both simp...
Publication date: 2014